www.is.informatik.uni-duisburg.de/bib/pdf/ir/Grossjohann_etal:02.pdf

January 2006 The VLDB Journal - The International Journal on Very Large Data Bases, Volume 15, Issue 1

Publisher: Springer-Verlag New York, Inc.

For querying structured and semistructured data, data retrieval and document retrieval 
are two valuable and complementary techniques that have not yet been fully integrated. 
In this paper, we introduce integrated information retrieval (IIR), an XML-based retrieval 
approach that closes this gap. We introduce the syntax and semantics of an extension of 
the XQuery language called XQuery/IR. The extended language realizes IIR and thereby 
allows users to formulate new kinds of queries by nesting rank ... 

Keywords: Data retrieval, Document retrieval, Index structures, Integrated information
retrieval, Structural join, XML

The problem of finding nearest neighbors to a query in a document collection is a special
case of associative retrieval, in which searches are performed using more than one key. A
nearest neighbors associative retrieval algorithm, suitable for document retrieval using
similarity matching, is described. The basic structure used is a binary tree; at each node a
set of keys (concepts) is tested to select the most promising branch. Backtracking to
initially rejected branches is allowed and ofte ... 
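
The tree search with backtracking described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Node` layout, the Jaccard `similarity` measure, and the `max_leaves` backtracking budget are all assumptions made for illustration, and the tiny tree is invented data.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass
class Node:
    # Hypothetical node layout: internal nodes test a set of keys (concepts),
    # leaves hold (doc_id, concepts) pairs.
    keys: FrozenSet[str] = frozenset()
    match: Optional["Node"] = None   # branch for queries sharing a key
    miss: Optional["Node"] = None    # branch for queries sharing none
    docs: Tuple[Tuple[str, FrozenSet[str]], ...] = ()


def similarity(a, b):
    """Jaccard overlap between two concept sets (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest(root, query, max_leaves=2):
    """Return (score, doc_id) of the best document found within the budget."""
    best = (-1.0, None)
    visited = 0

    def visit(node):
        nonlocal best, visited
        if node is None or visited >= max_leaves:
            return
        if node.match is None and node.miss is None:   # leaf: score its documents
            visited += 1
            for doc_id, concepts in node.docs:
                score = similarity(query, concepts)
                if score > best[0]:
                    best = (score, doc_id)
            return
        # Test the node's keys to pick the most promising branch first.
        if query & node.keys:
            first, second = node.match, node.miss
        else:
            first, second = node.miss, node.match
        visit(first)
        visit(second)   # backtrack to the initially rejected branch

    visit(root)
    return best


# Invented example data.
LEAF_A = Node(docs=(("d1", frozenset({"xml", "query"})),
                    ("d2", frozenset({"xml", "index"}))))
LEAF_B = Node(docs=(("d3", frozenset({"spelling", "ngram"})),
                    ("d4", frozenset({"tree", "merge"}))))
ROOT = Node(keys=frozenset({"xml"}), match=LEAF_A, miss=LEAF_B)

QUERY = frozenset({"tree", "merge", "xml"})
print(nearest(ROOT, QUERY, max_leaves=1))  # no backtracking
print(nearest(ROOT, QUERY, max_leaves=2))  # one backtrack allowed
```

With a budget of one leaf the search stops in the branch selected by the key test; allowing a second leaf lets it reach a closer document in the branch that was rejected at first, which is the situation where backtracking pays off.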

Query languages for XML often use path expressions to locate elements in XML
documents. Path expressions are regular expressions such that underlying alphabets 
represent conditions on nodes. Path expressions represent conditions on paths from the 
root, but do not represent conditions on siblings, siblings of ancestors, and descendants of 
such siblings. In order to capture such conditions, we propose to extend underlying 

We consider merging structured documents, which is to transform two given distinct
documents into isomorphic ones. Such merging is essential to synchronizing several 
copies of a document concurrently edited by several clients. Two documents, treated as 
ordered trees, are merged by applying a merge script consisting of add, del, upd, and 
move operations to the documents. We prove that the corresponding decision problem to 
finding an optimum merge script is NP-complete. Then, ...

Research aimed at correcting words in text has focused on three progressively more
difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3)
context-dependent word correction. In response to the first problem, efficient pattern-matching
and n-gram analysis techniques have been developed for detecting strings that
do not appear in a given word list. In response to the second problem, a variety of 
general and application-specific spelling cor ... 
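
As a rough illustration of the first problem (nonword error detection), the sketch below combines a word-list lookup with a character trigram check. The word list, the `#` boundary padding, and all function names are assumptions for this example, not techniques taken from the surveyed systems.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with '#' boundaries."""
    padded = "#" + word.lower() + "#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_ngram_table(word_list, n=3):
    """Collect every n-gram that occurs in at least one word of the list."""
    table = set()
    for word in word_list:
        table |= char_ngrams(word, n)
    return table


def is_nonword(token, lexicon):
    """A token is a nonword if it does not appear in the given word list."""
    return token.lower() not in lexicon


def unseen_ngrams(token, table, n=3):
    """Return the token's n-grams that never occur in the word list, sorted."""
    return sorted(char_ngrams(token, n) - table)


# Hypothetical word list; a real system would load a full dictionary.
WORDS = ["retrieval", "document", "query", "index", "tree"]
LEXICON = {w.lower() for w in WORDS}
TABLE = build_ngram_table(WORDS)

print(is_nonword("querry", LEXICON))   # True: not in the word list
print(unseen_ngrams("querry", TABLE))  # ['err', 'rry']
print(unseen_ngrams("query", TABLE))   # []
```

The lookup alone answers whether a string is in the list; the trigram check additionally points at where an unfamiliar letter sequence occurs, which can be useful when the word list is too large to probe cheaply.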

XIRQL ("circle") is an XML query language that incorporates imprecision and vagueness
for both structural and content-oriented query conditions. The corresponding uncertainty 
is handled by a consistent probabilistic model. The core features of XIRQL are (1) 
document ranking based on index term weighting, (2) specificity-oriented search for 
retrieving the most relevant parts of documents, (3) datatypes with vague predicates for 
dealing with specific types of content and (4) structural vagueness f ... 

Keywords: Path algebra, XML, XQuery, probabilistic retrieval, ranked retrieval, vague 
predicates 